Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Add generalized async signal handling #268

Closed
wants to merge 8 commits into from

Conversation

lia-viam
Copy link
Contributor

Extends the functionality of SignalManager to allow waiting on async signals, namely SIGINT, SIGTERM, SIGUSR1 and SIGUSR2, with the option to set fatal or non-fatal handlers.

This is mainly marked WIP because of the unit tests--I think both Boost.Test and the debugger don't love signal handling, and I was not able to write a test to check that unhandled signals are correctly ignored. The idea would be to construct eg a default signal handler and pass it an unhandled signal like SIGUSR1, but I'm not sure if/how it would be possible to meaningfully assert on the state of the std::thread after it receives an uncaught signal.

Also maybe the condition_variable stuff could be simplified or is not entirely necessary, not used to working with them.

@lia-viam lia-viam requested a review from acmorrow July 19, 2024 20:00
@lia-viam lia-viam requested a review from a team as a code owner July 19, 2024 20:00
@lia-viam lia-viam requested review from njooma and stuqdog and removed request for a team July 19, 2024 20:00
@acmorrow
Copy link
Member

I had a look at this a few times and kept bouncing off it. It took me a bit to sort out why it wasn't connecting for me, but I think I got there:

As things are currently put together, the SignalManager is entirely an implementation detail of the ModuleService. It isn't exposed to the module author such that they can interact with the signal handling meaningfully. At the same time, the SignalManager class isn't really meaningful outside of the context of ModuleService, since the job of the class is to ensure that the module follows mandated behavior for certain signals. Finally, the current usage of ModuleService has it happening all a little too late for comfort. The construction of the ModuleService happens fairly late in the game. At minimum, a ModelRegistration must have been created, as well as Model objects, etc. But the sigmask application and wait happen only once serve is called.

Maybe a better thing to do here is to leave ModuleService and SignalManager as they are, and pivot to writing a new ModuleService that both improves the flow of control and provides a signal management API that both enforces the requirements and sets things up for success early?

int main(int argc, char* argv[]) try {

    // Signals get masked here.
    // INT/TERM have default handlers.
    // A background thread is running pthread_sigwait
    auto service = ModuleService::create();
    
    service->register_server<SomeServer>(
        { "viam", "some", "server" },
        &impl::MySomeServer::validate,
        &impl::MySomeServer::create,
    );

    service->register_server<AnotherServer>(
        { "viam", "another", "server" },
        &impl::MyAnotherServer::validate,
        &impl::MyAnotherServer::create,
    );

    auto log_int_term_signals = [](ModuleService& service, int signal) {
        std::cerr << "Saw INT or TERM";
    };

    // This stacks above the existing handlers for INT/TERM, which cannot be removed.
    // If called before ModuleService::serve those will exit. If called after, the `serve` call below will unblock.
    service->push_signal_handler({SIGINT, SIGTERM}, log_int_term_signals);

    // This replaces any existing handler and returns prior. SIGINT/SIGTERM are disallowed
    auto old_handlers = service->set_signal_handler({SIGUSR1, SIGUSR2}, [](ModuleService& service, int) {
        // Under ASAN, report leaks on SIGUSR1/2
#if defined(__has_feature)
#if __has_feature(address_sanitizer)
        __lsan_do_recoverable_leak_check();
#endif
#endif
    });

    auto execution_parameters = service->parse_command_line(argc, argv);
    service->serve(std::move(execution_parameters));

    return EXIT_SUCCESS;

} catch (...) {
    // ... diagnostics
    return EXIT_FAILURE;
}

This is mostly me just thinking aloud. I think if we wanted to go this way we shouldn't change the existing ModuleService because it would break too many existing modules, but instead write a new one.

@lia-viam lia-viam closed this Oct 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants